Wide spread spectrum watermarking with side information and interference cancellation



Gaetan Le Guelvouit and Stephane Pateux 
IRISA/INRIA, Campus de Beaulieu, 35042 Rennes Cedex, FRANCE 

ABSTRACT 

Wide spread spectrum is a popular method for additive watermarking. It consists of adding
a spread signal to the host document. This signal is the sum of a set of carrier vectors, which
are modulated by the bits to be embedded. To extract these embedded bits, weighted correlations between the
watermarked document and the carriers are computed. Unfortunately, even without any attack, the extracted
bits can be corrupted by interference with the host signal (host interference) and by interference with
the other carriers (inter-symbol interference (ISI), due to the non-orthogonality of the
carriers). Some recent watermarking algorithms deal with host interference using side-informed methods, but
the inter-symbol interference problem is still open. In this paper, we deal with interference cancellation methods,
and we propose to consider ISI as side information and to integrate it into the host signal. This leads to a great
improvement of extraction performance in terms of signal-to-noise ratio and/or watermark robustness.

1. INTRODUCTION

The first studies in robust watermarking were mostly empirical. The domain became more academic when the
watermarking problem was considered as communication over a noisy channel: the watermark is a signal to be
transmitted through a channel corrupted by noise due to the cover signal and attacks. Watermarking was then
considered as a kind of channel coding. The latest contributions then focused on theoretical studies, inspired
by information theory, but not usable as such.

Due to constraints on the embedding distortion (MSE or weighted MSE), the power of the transmitted signal
is limited. The communication channel is noisy due to attacks. It has often been modeled as the addition of
white Gaussian noise (AWGN channel) [1, 2]. The host signal has then often been considered as a noise that
limits the performance of the watermarking scheme. But recently, it has been shown that watermarking can
be regarded as a problem of communication with side information [3]: a part of the added noise (i.e. the host
signal) is perfectly known during the embedding process. Costa [4] studied this kind of channel and gave a
capacity limit that is independent of the host signal. He also exhibited a theoretical algorithm (the Ideal Costa
Scheme, ICS) to reach this limit, considering i.i.d. Gaussian signals and AWGN transmission. However, since this
scheme relies on an exhaustive search among codevectors, its practical implementation is not realistic.
Some implementations inspired by the ICS were then proposed, using structured codebooks: Eggers's SCS [5] or
syndrome-based codes [6].

Costa's scheme assumes i.i.d. Gaussian signals. Unfortunately, real multimedia signals are not so simple.
Moreover, attacks may not be modeled as simple AWGN channels. Several studies proposed to consider
non-i.i.d. SAWGN (scaling and additive white Gaussian noise) channels [7-9]. Indeed, this class of attacks
accounts for filtering (such as Wiener filtering for noise removal), scaling, addition of noise correlated to
the host signal, noise from compression, etc. Furthermore, it has been shown [10, 11] that optimal attacks are
of the SAWGN kind. In order to use ICS properties, watermarking in a linear subspace using wide spread
spectrum (WSS) has been considered. While our previous work [12] assumes non-i.i.d. Gaussian signals, thanks to
the use of the spread transform subspace, the projected host signal and attack noise are i.i.d. and Gaussian.
Furthermore, this scheme leads to a practical implementation with performance close to optimal [13]. However,
adding a watermark in this subspace introduces symbol interference due to the non-orthogonality of the carriers
used for the spread transform. This ISI, like the host signal in non-informed watermarking, limits the
performance of the scheme.

Figure 1. The watermarking channel seen as communication with side information.

This paper deals with a practical and complete informed watermarking scheme, using spread spectrum and 
structured codebooks. It also provides a solution to symbol interference cancellation. In Sec. 2, we recall the 
subspace-based approach and introduce a structured codebook based on punctured convolutional codes. In 
Sec. 3, we first study two ISI cancellation methods, and we then provide an iterative algorithm to consider 
symbol interference as side information, illustrated by experimental results in Sec. 4. We finally conclude this 
paper in Sec. 5. 

2. SPREAD SPECTRUM FOR SIDE INFORMED WATERMARKING 

We have shown in our previous work [12, 13] a practical scheme that achieves performance close to the optimal
bounds [10]. The watermark is embedded in a linear subspace: i.i.d. Gaussian signals are then obtained and the ICS
can be applied. We first recall in this section Costa's original approach. In order to make the ICS practical,
we then introduce a structured codebook (dirty paper codes) based on convolutional codes. We finally describe
our WSS-based embedding method, optimized using game theory (min-max optimization).

2.1. Channel with side information: Costa's approach 

As seen in the introduction, the watermarking problem can be seen as a communication process with side
information available at the encoder [3]. This kind of channel has been studied by Costa [4], which led to an
upper bound on its capacity.

Let us consider an $n$-long i.i.d. Gaussian host signal $x$, whose samples are modeled by $X \sim \mathcal{N}(0, Q)$.
This signal is perfectly known during the embedding process. We transmit our data with a watermark signal
$w = \{w_1, w_2, \ldots, w_n\}$, as seen in Fig. 1. The energy of $w$ is bounded so that

$$\frac{1}{n} \sum_{i=1}^{n} w_i^2 \leq P. \qquad (1)$$

The transmitted signal is then $y = x + w$. This signal is corrupted during the transmission by an added
Gaussian noise $z$, modeled by $Z \sim \mathcal{N}(0, N)$. The receiver then gets the signal $y' = y + z$. If we
consider this channel as a classical Gaussian one, two noises are added to the transmitted signal, so the
capacity is given by

$$C = \frac{1}{2} \log_2\left(1 + \frac{P}{Q + N}\right). \qquad (2)$$

Treated this way, the side information $x$ degrades the performance of the system by lowering the capacity.
Costa showed, however, that the side information does not influence the optimal capacity of the channel, i.e.

$$C = \frac{1}{2} \log_2\left(1 + \frac{P}{N}\right). \qquad (3)$$

Figure 2. Perturbation of signal $x$ when embedding a watermark associated to codevector $u^*$.

He gave a theoretical method to reach this value. He considered a signal $U \sim \mathcal{N}(0, P + \alpha^2 Q)$,
known to both the embedder and the extractor. The capacity of the channel is then given by

C = max{I(U;Y)-I(U;X)}, (4) 

where $Y \sim \mathcal{N}(0, Q + P)$ models the transmitted signal $y$. Costa showed that the previous equation
leads to the optimal value $\alpha = P/(P + N)$, and then to Eqn. 3. The signal $U$ is obtained using a
structured codebook $\mathcal{U}$ of $2^{n(I(U;Y) - \epsilon)}$ elements, designed so that there is a surjective
mapping from the codebook $\mathcal{U}$ onto the set $\mathcal{M}$ of possible messages to embed: each possible
message $m$ is associated to a sub-codebook $\mathcal{U}_m$ composed of $2^{nI(U;X)}$ codewords. During the
embedding process, the codeword $u^* \in \mathcal{U}_m$ closest to $x$ is chosen. The watermark signal is
then given by

$$w = u^* - \alpha x. \qquad (5)$$

Whereas classical watermarking techniques would have transmitted $x + \sqrt{nP} \times u^*/\|u^*\|$, the $\alpha$
term forces the transmitted signal to go toward the codevector, as illustrated by Fig. 2. At the extraction
process, the closest codeword $\hat{u} \in \mathcal{U}$ is computed. The decoded message is then $\hat{m}$ such
that $\hat{u} \in \mathcal{U}_{\hat{m}}$.
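To give a feel for the gap between the two capacity expressions discussed in this section, the following minimal sketch evaluates the classical Gaussian capacity $\frac{1}{2}\log_2(1 + P/(Q+N))$ (host treated as noise), Costa's capacity $\frac{1}{2}\log_2(1 + P/N)$, the optimal $\alpha = P/(P+N)$, and the watermark $w = u^* - \alpha x$. The function names and the numerical powers are illustrative assumptions, not values from the paper.

```python
import math

def capacity_host_as_noise(P, Q, N):
    """Bits per sample when the host (power Q) is simply treated as extra noise."""
    return 0.5 * math.log2(1.0 + P / (Q + N))

def capacity_costa(P, N):
    """Bits per sample with side information at the encoder: Q does not appear."""
    return 0.5 * math.log2(1.0 + P / N)

def costa_alpha(P, N):
    """Optimal Costa parameter alpha = P / (P + N)."""
    return P / (P + N)

def costa_watermark(u_star, x, alpha):
    """Watermark w = u* - alpha * x, computed sample by sample."""
    return [u - alpha * xi for u, xi in zip(u_star, x)]

if __name__ == "__main__":
    P, Q, N = 1.0, 100.0, 1.0  # illustrative powers only (assumption)
    print("host as noise:", capacity_host_as_noise(P, Q, N))
    print("Costa        :", capacity_costa(P, N))
    print("alpha        :", costa_alpha(P, N))
```

With a host much stronger than the watermark, the first value is close to zero while the second depends only on the ratio $P/N$.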

2.2. Dirty paper codes from punctured ones 

The original ICS is based on large random codebooks: the only way to decode $y'$ is by an exhaustive search in
$\mathcal{U}$. Some practical but suboptimal approaches, inspired by the ICS, have been proposed for i.i.d.
Gaussian host signals, based on codebooks used for error correcting codes (ECC) [5, 14-16], where the decoding
process is designed to be much simpler than an exhaustive search.

Each possible $k$-long message is associated to $2^{nI(U;X)}$ codewords. A simple way to design such a structured
codebook would be to insert $i = n \times I(U;X)$ index bits in the message and to encode it. For an ECC with rate
$r$, this leads to $n$-long codewords with $n = (k + i)/r$ (see Fig. 3). According to Costa, the value of $i$ is
given by

$$i = \frac{n}{2} \log_2\left(1 + \alpha^2 \frac{Q}{P}\right), \qquad (6)$$

which then depends on $Q$, i.e. the host signal. Thus the final codeword length $(k + i)/r$ may vary, while the
host signal length $n$ is generally given and fixed (number of pixels for an image, number of samples for a
sound, etc.). The length of codewords must not depend on $i$, i.e. the global rate $k/n$ must be fixed.

Figure 3. Adding $i$ index bits to design a structured codebook ($k = 8$, $i = 4$, $r = 1/2$ leading to $n = 24$).
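The dependence of the codeword length on the host power can be checked numerically. The sketch below assumes Costa's construction $U = W + \alpha X$ with $W$ independent of $X$, for which $I(U;X) = \frac{1}{2}\log_2(1 + \alpha^2 Q/P)$; the function names are hypothetical and the powers are illustrative.

```python
import math
from fractions import Fraction

def index_bits(n, P, Q, N):
    """Number of index bits i = n * I(U;X), assuming U = W + alpha*X (Costa)."""
    alpha = P / (P + N)
    return n * 0.5 * math.log2(1.0 + alpha ** 2 * Q / P)

def unpunctured_length(k, i, r):
    """Codeword length (k + i) / r for an ECC of rate r."""
    return (k + i) / Fraction(r)

if __name__ == "__main__":
    # Parameters quoted in the caption of Fig. 3: k = 8, i = 4, r = 1/2.
    print(unpunctured_length(8, 4, Fraction(1, 2)))  # 24
    # The index-bit count grows with the host power Q (illustrative values).
    for Q in (1.0, 10.0, 100.0):
        print(Q, index_bits(16, 1.0, Q, 1.0))
```

Since the number of index bits changes with $Q$, the unpunctured length changes too, which is the difficulty the fixed-rate design has to remove.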

We thus propose to use a simple codebook based on punctured convolutional codes and soft trellis decoding.
Let us choose an error correcting code in order to get a rate $r = k/n$. We then design an interleaved pattern
composed of the $k$ bits from the message $m$ to be embedded and of $i$ additional bits, as illustrated in Fig. 4(a).
We then expand the host signal from $n$ to $(k+i)/r$ elements using neutral values for soft decoding (i.e. 0). This
expanded host signal is decoded with a modified soft Viterbi decoding algorithm, using the previous $k$-bit
pattern as a strong a priori in order to force some transitions in the convolutional trellis (see Fig. 4(b)).
The output fixes the $i$ index bits and gives a $(k + i)/r$-long codeword, which is punctured according to the
previous expansion of the host signal, in order to remove $i/r$ bits and to finally get an $n$-long codeword.
This leads to the codeword $u^* \in \mathcal{U}_m$ closest to $x$. Using BPSK*, all the obtained codewords are
designed to have the same energy, i.e. the norm $\|u\|$ takes the same value for every codeword $u \in \mathcal{U}$.
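The expansion and puncturing steps can be sketched as follows. The puncturing pattern used here (dropping every third position) is purely an assumption for illustration; only the lengths follow the parameters quoted for Fig. 4 ($k = 8$, $i = 4$, $r = 1/2$, $n = 16$), where the full codeword has $(k+i)/r = 24$ positions and $i/r = 8$ of them are removed.

```python
def expand(host, kept):
    """Insert neutral soft values (0.0) at the punctured positions.

    kept[p] is True where position p of the full-length codeword survives
    puncturing; the host samples fill those positions in order.
    """
    if sum(kept) != len(host):
        raise ValueError("number of kept positions must equal the host length")
    samples = iter(host)
    return [next(samples) if keep else 0.0 for keep in kept]

def puncture(codeword, kept):
    """Drop the positions that were filled with neutral values by expand()."""
    return [c for c, keep in zip(codeword, kept) if keep]

if __name__ == "__main__":
    full_length = 24                                     # (k + i) / r
    kept = [(p % 3) != 2 for p in range(full_length)]    # hypothetical pattern
    host = [float(v) for v in range(1, 17)]              # n = 16 samples
    expanded = expand(host, kept)
    print(len(expanded), len(puncture(expanded, kept)))  # 24 16
```

In the scheme described above, the expanded signal would be passed to the soft Viterbi decoder, and the resulting full-length codeword punctured with the same pattern.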

The watermark is finally chosen in order to get the maximum robustness [17]: the codeword $u^*$ is associated
to a hyper-cone of robustness, in which $y$ must lie to be correctly decoded. Further, hyperboloids may be
defined to represent sets of points of given robustness (e.g. $\mathcal{H}_{n_1}$, $\mathcal{H}_{n_2}$, ... in
Fig. 2). The watermark $w$ is defined in order to maximize robustness, that is

where $\theta$ is the angle of the hyper-cone, given by

At the receiver, the signal $y' = y + z$ is expanded from $n$ to $(k + i)/r$ elements by inserting neutral
elements, and decoded using the trellis to get $\hat{m}$. Thanks to soft decoding and to the fact that all
codewords have the same energy, this coding scheme is scale resistant, i.e. $y'$ can be scaled
($y' = \gamma [y + z]$ with $\gamma > 0$) without loss of robustness.
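The scale resistance can be illustrated with a toy decoder. When all codewords have the same energy, the closest codeword is the one with the largest correlation with the received signal, and multiplying the received signal by any positive factor leaves that choice unchanged. The tiny codebook below is a made-up example, and an exhaustive search stands in for the trellis decoder.

```python
def correlation(a, b):
    """Plain inner product between two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))

def closest_codeword(y, codebook):
    """Index of the codeword with maximum correlation with y.

    For equal-energy codewords this is also the minimum-distance codeword.
    """
    return max(range(len(codebook)), key=lambda c: correlation(y, codebook[c]))

if __name__ == "__main__":
    # Hypothetical BPSK codebook: every codeword has squared norm 4.
    codebook = [[1, 1, 1, 1], [1, -1, 1, -1], [-1, -1, 1, 1], [-1, 1, 1, -1]]
    received = [0.9, -1.2, 0.4, -0.7]
    for gain in (0.3, 1.0, 5.0):
        scaled = [gain * v for v in received]
        print(gain, closest_codeword(scaled, codebook))
```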

2.3. Game theory applied to spread spectrum 

Multimedia signals are not usually i.i.d. and Gaussian. So we consider a non-i.i.d. host signal $x$ modeled by a
set of random variables $X^m = \{X_1, X_2, \ldots, X_m\}$ with $X_i \sim \mathcal{N}(0, \sigma_{x_i}^2)$, i.e. the
signal is modeled as a mixture of Gaussians. We also consider a more general model for attack: SAWGN. The
received signal can then be written as $y'_i = \gamma_i^a y_i + z_i$, where $\gamma_i^a > 0$ and $z_i$ is a
Gaussian noise modeled by $Z_i \sim \mathcal{N}(0, \sigma_{z_i}^2)$. To embed an $n$-symbol message in an $m$-long
signal, wide spread spectrum uses a pseudo-random matrix $G \in \{-1, 1\}^{m \times n}$. This can be associated
to a spread transform, i.e. the embedding process is made in a linear subspace, as for ST-DM [15] or
ST-SCS [9, 18]. Our previous work [12, 19] demonstrated the interest of Wiener filtering at embedding†:

$$y_i = \gamma_i^w \left[ x_i + w_i \right]. \qquad (9)$$

* Binary Phase Shift Keying.

† Since the attacker would perform Wiener filtering to decrease $D_{xy'}$, Wiener filtering at embedding
decreases $D_{xy}$ without loss of performance.
(b) Decoding the expanded host signal using a modified Viterbi algorithm.
Figure 4. The search for the closest codeword at the embedding stage ($k = 8$, $i = 4$, $n = 16$ and $r = 1/2$).

The watermark $w = \{w_1, w_2, \ldots, w_m\}$ is thus non-i.i.d. and is modeled by $W^m$ with
$W_i \sim \mathcal{N}(0, \sigma_{w_i}^2)$. The Wiener filtering and the scale attack can be grouped:
$\gamma_i = \gamma_i^w \times \gamma_i^a$. The inverse spread transform (used for extraction) is defined by a
weighted linear correlation [12]:

where $\beta_i$ is a weighting factor. As demonstrated previously [13] considering a SI scheme, the optimal value
for $\beta_i$ can be expressed as

(13) 



In this subspace, the embedding process from Eqn. (9) is written as

$$\forall j \in \{1, 2, \ldots, n\}, \quad y_j^{st} = x_j^{st} + w_j^{st},$$

where $x^{st}$ is i.i.d. and Gaussian. We can then use Costa's approach described in Sec. 2.1, and define from
Eqns. (9) and (12) the different amounts of energy used as

We remark that while $\sigma_{x_i}^2 \gg \sigma_{w_i}^2$ to ensure the invisibility of the watermark, the
available watermark energy $P$ is concentrated in the subspace and can then become larger than the energy $Q$
of the host signal (when $m/n \gg 1$). It also shows that $P$ is shared by the symbols to be embedded: more
symbols (i.e. larger $n$) means less watermark energy per symbol.
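As a rough illustration of the spread transform and of its inverse by weighted linear correlation, the sketch below spreads $n$ subspace symbols over $m$ host positions and recovers them by correlating with the carriers. The forms $w_i = \sum_j G_{i,j} s_j$ and $s_j = \sum_i \beta_i G_{i,j} y_i$, the uniform weights $\beta_i = 1/m$, and the small orthogonal carrier matrix are simplifying assumptions for this example, not the optimized expressions of the scheme.

```python
def spread(symbols, G):
    """Spread n subspace symbols over m positions: w_i = sum_j G[i][j] * s_j."""
    return [sum(g * s for g, s in zip(row, symbols)) for row in G]

def inverse_spread(signal, G, beta):
    """Weighted linear correlation: s_j = sum_i beta[i] * G[i][j] * signal[i]."""
    n = len(G[0])
    return [
        sum(b * row[j] * v for b, row, v in zip(beta, G, signal))
        for j in range(n)
    ]

if __name__ == "__main__":
    # Hypothetical 4 x 2 carrier matrix with orthogonal columns.
    G = [[1, 1], [1, -1], [1, 1], [1, -1]]
    beta = [0.25] * 4            # uniform weights 1/m (assumption)
    symbols = [0.5, -2.0]
    print(inverse_spread(spread(symbols, G), G, beta))
```

With orthogonal carriers the symbols come back unchanged; with pseudo-random carriers each recovered symbol would also contain a contribution from the other symbols.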

Given a maximum amount of embedding distortion, we must optimize the embedding energy, i.e. $\sigma_{w_i}$.
We define an embedding and an attack distortion function:

where $\varphi_i$ is a perceptual weighting factor. The performance of the inverse spread transform can be
quantified by the signal-to-noise ratio $E_b/N_0$ defined as

It should be noted that this value is not the signal-to-noise ratio obtained at the output of the extractor from
Eqns. (12) and (13), given by [13]

We now solve the optimization of $\sigma_{w_i}$ using a min-max game: given a maximal amount of attack
distortion $D_{xy'}^{\max}$, the attacker wants to minimize $E_b/N_0$, while the embedder wants to maximize it,
for a maximal amount of embedding distortion $D_{xy}^{\max}$. This is done by two Lagrangian optimizations [12].
First, for the attacker, we get the following functional:

where $\lambda$ is a Lagrangian multiplier used to respect the constraint on the attack distortion. This leads
to the optimal values for $\gamma_i$ and $\sigma_{z_i}$.

The second part of the game consists in optimizing the embedding parameters considering the optimal attack,
which is also done by a Lagrangian approach:

where $\chi$ is a Lagrangian multiplier used to respect the constraint on the embedding distortion. This leads to
the final optimal embedding parameters

and also to a particular expression for the optimal correlation factor for the inverse spread transform:
$\beta_i \propto \varphi_i$, when considering optimal attacks.

Figure 5. Signal-to-noise ratio against AWGN attack, for the classical image Lena ($n = 162$, $m = 512 \times 512$
and $E[\sigma_{w_i}] = 2.5$ $\forall i \in \{1, 2, \ldots, m\}$).

3. INTERFERENCE CANCELLATION 

Without side-informed watermarking, the signal-to-noise ratio we get is given by $E_b/N_0 = P/(Q + N)$. In
practice, this value is correct only if the carriers $G = \{G_1, G_2, \ldots, G_n\}$ of the spread transform are
truly orthogonal, which is not the case with pseudo-random carriers. Thus the signal-to-noise ratio is given by

$$\frac{E_b}{N_0} = \frac{P}{Q + N + I},$$

where the term $I$, which scales with the number $n - 1$ of interfering carriers, is known as the inter-symbol
interference. In non-informed watermarking techniques (where the host signal influences the performance of the
scheme), this interference is negligible because $Q \gg I$. But in informed watermarking, for a low level of
attack, it represents a great amount of noise that limits the robustness and/or the capacity of the scheme.
Fig. 5 illustrates the gap between WSS watermarking with pseudo-random carriers and theoretical WSS
watermarking (with truly orthogonal carriers). We will see in the remaining part of this section three methods
to cancel this interference.
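The non-orthogonality of pseudo-random carriers is easy to observe numerically. The sketch below draws $\pm 1$ carriers and measures the mean squared normalized cross-correlation between distinct carriers; for independent $\pm 1$ sequences of length $m$ its expected value is $1/m$, which is small but not zero. The carrier generator and the sizes are illustrative assumptions.

```python
import random

def random_carriers(m, n, seed=0):
    """m x n matrix of pseudo-random +/-1 entries (one column per carrier)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]

def mean_squared_cross_correlation(G):
    """Average of (G_a . G_b / m)^2 over all pairs of distinct carriers."""
    m, n = len(G), len(G[0])
    total, pairs = 0.0, 0
    for a in range(n):
        for b in range(a + 1, n):
            c = sum(row[a] * row[b] for row in G) / m
            total += c * c
            pairs += 1
    return total / pairs

if __name__ == "__main__":
    m, n = 256, 8  # illustrative sizes only
    G = random_carriers(m, n)
    print("measured:", mean_squared_cross_correlation(G))
    print("1/m     :", 1.0 / m)
```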

3.1. Ensuring orthogonality of the carriers

To avoid interference, a trick is to embed only one symbol per host element [15], i.e. $\forall i \in \{1, 2,
\ldots, m\}$, there is only one element in $\{G_{i,1}, G_{i,2}, \ldots, G_{i,n}\}$ which is not set to 0. In this
case, $I = 0$. However this technique limits the spreading of the bits, especially for large values of $n$,
which is the case where interference cancellation is most interesting (low level of attack noise).

Moreover, in the case of smooth signals (see Fig. 6 for an example), the number of well-suited host elements
(high value of $\gamma_i \sigma_{w_i}/\sigma_{x_i}$) is limited. The energy of the watermark is mainly located on
high-energy coefficients. Since the number of such coefficients is small, the symbols to be hidden may not be
equally spread over the host signal (i.e. a symbol may be spread on non-significant coefficients whereas another
one will be spread on significant ones). This results in non-i.i.d. signals in the linear subspace. Performance
is not guaranteed and parallel channels should rather be considered.
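A minimal sketch of such disjoint carriers is given below: each host element (row) carries exactly one symbol, so any two carriers have disjoint supports and their cross-correlation is exactly zero. The random assignment of host elements to symbols is an assumption for illustration; it also shows that the number of host elements per symbol is generally uneven.

```python
import random

def disjoint_carriers(m, n, seed=0):
    """m x n matrix with exactly one non-zero (+1 or -1) entry per row."""
    rng = random.Random(seed)
    G = [[0] * n for _ in range(m)]
    for row in G:
        row[rng.randrange(n)] = rng.choice((-1, 1))
    return G

def cross_correlation(G, a, b):
    """Inner product between carriers (columns) a and b."""
    return sum(row[a] * row[b] for row in G)

def support_sizes(G):
    """Number of host elements assigned to each symbol."""
    return [sum(1 for row in G if row[j] != 0) for j in range(len(G[0]))]

if __name__ == "__main__":
    G = disjoint_carriers(64, 4)
    print([cross_correlation(G, 0, b) for b in range(1, 4)])  # all zeros
    print(support_sizes(G))
```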

3.2. Cancellation at the decoder 

If pseudo-random carriers were used at embedding, the received signal in the spread transform subspace can
be written as

$$y'^{st} = \gamma \left[ x^{st} + w^{st} + \mathrm{isi}(w^{st}) \right] + z^{st}, \qquad (29)$$

where $\mathrm{isi}(w^{st})$ is the inter-symbol interference. For $Q \ll P$ (very common case for payloads such
as $n < 1000$), we can write $x^{st} + w^{st} \approx w^{st} \approx \sqrt{P} \times u^*$ and then

$$y'^{st} \approx \gamma \left[ \sqrt{P} \times u^* + \mathrm{isi}(\sqrt{P} \times u^*) \right] + z^{st}. \qquad (30)$$

(a) Original image. (b) Three-level DWT of Arctic hare.
Figure 6. Arctic hare, a difficult image to watermark: the number of interesting elements is limited (copyright
photos courtesy of Robert E. Barber, Barber Nature Photography).

Thus we can estimate $\mathrm{isi}(\sqrt{P} \times \hat{u}^*)$ in order to cancel ISI. The receiver first
estimates $\hat{u}^*$. The corresponding interference is then canceled. The new $y'^{st}$ is obtained and used to
compute a new estimate $\hat{u}^*$. This process iterates until the estimate no longer changes (see Alg. 1). To
be efficient, the receiver must know (or estimate) the embedding energy $\sigma_{w_i}$. Moreover, the optimal
scaling factor $\gamma_i$ must also be estimated. This may be done with an additional reference signal, leading
to a lower capacity for message bits. We will then search for another solution consisting in canceling ISI at
embedding.
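The decoder-side iteration can be sketched as follows. This is a toy model under explicit assumptions: the interference on symbol $j$ is taken to be $\sum_{l \neq j} R_{j,l} s_l$ with $R$ a normalized carrier correlation matrix, the amplitude `amp` (standing for $\gamma\sqrt{P}$) is known to the receiver, and a sign detector replaces the closest-codeword search. The matrix used in the example is hand-made so that the first decision is wrong and the iteration corrects it.

```python
def isi(R, s):
    """Interference on each symbol from all the other symbols (assumed model)."""
    n = len(s)
    return [sum(R[j][l] * s[l] for l in range(n) if l != j) for j in range(n)]

def cancel_isi_at_decoder(y_st, R, amp, decode, max_iter=20):
    """Alternate between decoding and subtracting the estimated interference."""
    u_hat = decode(y_st)
    for _ in range(max_iter):
        interference = isi(R, [amp * u for u in u_hat])
        cleaned = [y - i for y, i in zip(y_st, interference)]
        u_new = decode(cleaned)
        if u_new == u_hat:      # estimate is stable: stop
            break
        u_hat = u_new
    return u_hat

def sign_decoder(v):
    """Stand-in for the closest-codeword search (uncoded BPSK)."""
    return [1 if x >= 0 else -1 for x in v]

if __name__ == "__main__":
    R = [[1.0, 0.4, 0.4, 0.4],
         [0.4, 1.0, 0.0, 0.0],
         [0.4, 0.0, 1.0, 0.0],
         [0.4, 0.0, 0.0, 1.0]]   # hypothetical correlations, noise-free case
    u = [1, -1, -1, -1]
    y_st = [a + b for a, b in zip(u, isi(R, u))]
    print(sign_decoder(y_st))                            # first symbol is wrong
    print(cancel_isi_at_decoder(y_st, R, 1.0, sign_decoder))
```

The iteration is not guaranteed to converge for strongly correlated carriers, hence the `max_iter` guard.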

3.3. Interference as side information 

As seen in Sec. 2.1, the use of the side information available during the embedding process leads to great 
improvements, and if no attack is applied during the transmission, the capacity of the channel is infinite. But 
the spread transform we use to embed the watermark introduces a noise that limits capacity due to ISI. We 
propose to consider ISI as a kind of side information. 

Even if this interference is introduced by the embedder, it is not perfectly known before the embedding.
So, it cannot be directly considered as side information. The problem is that the interference depends on the
watermark signal, which itself depends on the interference. We use an iterative algorithm to converge to a
watermark signal that takes into account its own interference. This algorithm is described by Alg. 2. We first
compute $w^{st}$, as explained in Sec. 2. The interference it produces is computed, and introduced as side
information. In a second step, this new side information $x^{st}$ is used to compute an updated watermark
signal. The previous steps are iterated until $w^{st}$ converges (we observe that convergence is typically
reached in fewer than 3 iterations). At the end of the loop, the watermark signal takes into account the host
signal and the symbol interference, and is added using Eqn. (9).
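The embedder-side iteration can be sketched in the same toy setting. The assumptions are stated plainly: the interference model is $\sum_{l \neq j} R_{j,l} w_l$, the watermark is computed with Costa's rule $w = u^* - \alpha x$ rather than the maximum-robustness rule of the scheme, the power constraint is not enforced, and an exhaustive search over a tiny hypothetical sub-codebook replaces the trellis search.

```python
def isi(R, s):
    """Interference on each symbol from all the other symbols (assumed model)."""
    n = len(s)
    return [sum(R[j][l] * s[l] for l in range(n) if l != j) for j in range(n)]

def embed_with_isi_as_side_info(x_st, R, alpha, closest_codeword,
                                tol=1e-9, max_iter=100):
    """Iterate until the watermark accounts for its own interference."""
    u = closest_codeword(x_st)
    w = [ui - alpha * xi for ui, xi in zip(u, x_st)]      # first pass, ISI ignored
    for _ in range(max_iter):
        side = [xi + ii for xi, ii in zip(x_st, isi(R, w))]  # host + own ISI
        u = closest_codeword(side)
        w_new = [ui - alpha * si for ui, si in zip(u, side)]
        converged = max(abs(a - b) for a, b in zip(w_new, w)) < tol
        w = w_new
        if converged:
            break
    return w, u

def nearest(codebook):
    """Build an exhaustive minimum-distance search over a small codebook."""
    def search(v):
        return min(codebook,
                   key=lambda c: sum((ci - vi) ** 2 for ci, vi in zip(c, v)))
    return search

if __name__ == "__main__":
    R = [[1.0, 0.1], [0.1, 1.0]]                  # hypothetical correlations
    x_st = [0.5, -0.3]
    search = nearest([[1.0, 1.0], [-1.0, -1.0]])  # hypothetical sub-codebook
    w, u = embed_with_isi_as_side_info(x_st, R, 1.0, search)
    y_st = [xi + wi + ii for xi, wi, ii in zip(x_st, w, isi(R, w))]
    print(u, y_st)  # with alpha = 1, y_st lands on the codeword
```

With $\alpha = 1$ (no attack), the signal seen in the subspace, interference included, coincides with the chosen codeword at the fixed point.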



4. EXPERIMENTAL RESULTS 

Algorithm 1. Considering $w^{st} \approx \sqrt{P} \times u^*$, search for the closest codeword $\hat{u}^*$ from
$y'^{st}$ with ISI canceled.
  repeat
    for j = 1 to n do
      cancel the estimated interference on $y'^{st}_j$ (see Sec. 3.2)
    end for
    $\hat{u}^*$ <- closest codeword to $y'^{st}$
  until $\hat{u}^*$ no longer changes

(a) Performance against AWGN attack. (b) Performance against JPEG lossy compression ($D_{xy'} = 7$ for 100 %
JPEG quality, and $D_{xy'} \approx 20$ for 15 % JPEG quality).
Figure 7. Signal-to-noise ratio against attacks for Lena (512 x 512 gray-scale image, 3-level DWT, $n = 132$ and
$D_{xy} = 7$).

The previous studies have been applied to image watermarking. A 3-level wavelet transform of a gray-scale
image generates the host signal $x$ ($m$ is equal to the number of pixels of the host image). We embed $k = 64$
bits using a structured codebook, as described in Sec. 2.2, with a rate equal to 1/2. This leads to $n = 132$ with
some padding bits. We consider a psycho-visual factor inspired by Watson's model [20], defined by

where $\rho$ is set to get $E[\varphi_i] = 1$ and $\eta_{x_i}$ is a normalized activity measure (based on the
variance of $X_i$). We finally tune $\lambda$ and $\chi$ to obtain an embedding distortion equal to 7.0 (i.e.
wpsnr(x, y) = 39.7 dB). Two attacks are tested: Gaussian noise and JPEG lossy compression.
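The correspondence between the embedding distortion of 7.0 and the quoted 39.7 dB can be checked with the usual PSNR formula applied to the weighted MSE, assuming 8-bit images (peak value 255). The peak value and the function name are assumptions; the paper's exact wPSNR definition is not restated here.

```python
import math

def wpsnr_db(weighted_mse, peak=255.0):
    """Weighted PSNR in dB from a weighted MSE, assuming the given peak value."""
    return 10.0 * math.log10(peak ** 2 / weighted_mse)

if __name__ == "__main__":
    print(round(wpsnr_db(7.0), 1))  # 39.7
```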

For each attack level (energy of the added noise for AWGN attack and quality factor for JPEG compression),
the resulting distortion $D_{xy'}$ is computed and the watermark is extracted to get the signal-to-noise ratio
$E_b/N_0$. Figs. 7 and 8‡ confirm the interest of interference cancellation, already shown by the theoretical
Fig. 5.

‡ Both images used are available from F. Petitcolas' web site:
<http://www.cl.cam.ac.uk/~fapp2/watermarking/image_database>.

Algorithm 2. Calculate $w^{st}$ considering ISI as side information.
  repeat
    for j = 1 to n do
      add the interference produced by $w^{st}$ on symbol j to the side information $x^{st}_j$ (see Sec. 3.3)
    end for
    $u^*$ <- closest codeword to $x^{st}$
    update $w^{st}$ from $u^*$ and $x^{st}$ (see Sec. 2)
  until $w^{st}$ converges

Figure 8. Signal-to-noise ratio against attacks for Paper machine (512 x 512 gray-scale image, 3-level DWT,
$n = 132$ and $D_{xy} = 7$).



5. CONCLUSION 



We studied in this paper a practical implementation of a watermarking scheme exploiting side information.
We proposed a scheme based on a simple structured codebook using a soft Viterbi decoder. A spread
transform produces i.i.d. signals from non-i.i.d. ones. Embedding in the linear subspace defined by the spread
transform generates inter-symbol interference. An iterative algorithm estimates this interference and includes
it in the side information. We finally applied this scheme to image watermarking: this leads to significant
improvements in terms of capacity and/or robustness.